Xi-Vector Embedding for Speaker Recognition

Authors

Abstract

We present a Bayesian formulation for deep speaker embedding, wherein the xi-vector is the Bayesian counterpart of the x-vector, taking into account the uncertainty estimate. On the technology front, we offer a simple and straightforward extension to the now widely used x-vector. It consists of an auxiliary neural net predicting the frame-wise uncertainty of the input sequence. We show that the proposed extension leads to substantial improvement across all operating points, with a significant reduction in error rates and detection cost. On the theoretical front, our proposal integrates the Bayesian formulation of the linear Gaussian model into speaker-embedding neural networks via the pooling layer. In one sense, our proposal integrates the Bayesian formulation of the i-vector with that of the x-vector. Hence, we refer to the embedding as the xi-vector, which is pronounced /zai/ vector. Experimental results on the SITW evaluation set show a consistent improvement of over 17.5% in equal-error-rate and 10.9% in minimum detection cost.
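
The pooling idea summarized in the abstract can be illustrated with a minimal sketch. It assumes diagonal precisions and the standard posterior of a linear Gaussian model: each frame contributes a feature vector and a per-dimension precision (the inverse of its uncertainty), and these are combined with a Gaussian prior. The function name, the argument names, and the choice to pass log-precisions are hypothetical and are not taken from the paper; this is not the authors' implementation.

```python
import math
from typing import List, Sequence, Tuple


def gaussian_posterior_pooling(
    frames: Sequence[Sequence[float]],
    log_precisions: Sequence[Sequence[float]],
    prior_mean: Sequence[float],
    prior_log_precision: Sequence[float],
) -> Tuple[List[float], List[float]]:
    """Precision-weighted pooling of frame features (diagonal case).

    Assumed model (illustrative only): for each dimension,
        posterior precision = prior precision + sum of frame precisions
        posterior mean      = (prior precision * prior mean
                               + sum of frame precision * frame feature)
                              / posterior precision
    """
    dim = len(prior_mean)
    # Start from the prior: its precision and precision-weighted mean.
    post_prec = [math.exp(lp) for lp in prior_log_precision]
    weighted_sum = [p * m for p, m in zip(post_prec, prior_mean)]

    # Accumulate each frame's precision and precision-weighted feature.
    for z_t, log_l_t in zip(frames, log_precisions):
        for d in range(dim):
            l_td = math.exp(log_l_t[d])
            post_prec[d] += l_td
            weighted_sum[d] += l_td * z_t[d]

    post_mean = [w / p for w, p in zip(weighted_sum, post_prec)]
    return post_mean, post_prec


if __name__ == "__main__":
    # Two one-dimensional frames with unit precision and a unit-precision prior at 0.
    mean, prec = gaussian_posterior_pooling(
        frames=[[1.0], [3.0]],
        log_precisions=[[0.0], [0.0]],
        prior_mean=[0.0],
        prior_log_precision=[0.0],
    )
    print(mean, prec)
```

In this sketch, frames with low predicted precision (high uncertainty) contribute little to the pooled mean, whereas plain average pooling would weight every frame equally.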

Similar resources

Graph-embedding for speaker recognition

Popular methods for speaker classification perform speaker comparison in a high-dimensional space [1]; however, recent work [2] has shown that most of the speaker variability is captured by a low-dimensional subspace of that space. In this paper we examine whether additional structure in terms of nonlinear manifolds exists within the high-dimensional space. We will use graph embedding [3] as a p...

Phonetic Speaker Recognition with Support Vector Machines

A recent area of significant progress in speaker recognition is the use of high-level features (idiolect, phonetic relations, prosody, discourse structure, etc.). A speaker not only has a distinctive acoustic sound but uses language in a characteristic manner. Large corpora of speech data available in recent years allow experimentation with long-term statistics of phone patterns, word patterns, et...

A Vector Quantization Approach to Speaker Recognition

© 1985 IEEE. Abstract: In this study a vector quantization (VQ) codebook was used as an efficient means of characterizing the short-time spectral features of a speaker. A set of such codebooks were then used to ... In the other, Shore and Burton [12] used word-based VQ codebooks and reported good performance in speaker-trained isolated-word recognition experime...

Automatic Speaker Recognition Using Fuzzy Vector Quantization

Speaker recognition (SR) is a dynamic biometric task. SR is a multidisciplinary problem that encompasses many aspects of human speech, including speech recognition, language recognition, and speech accents. This technique makes it possible to use the speaker's voice to verify his/her identity and provide controlled access to services. The Mel-frequency extraction method is a leading approach for sp...

Unsupervised Domain Adaptation for I-vector Speaker Recognition

In this paper, we present a framework for unsupervised domain adaptation of PLDA-based i-vector speaker recognition systems. Given an existing out-of-domain PLDA system, we use it to cluster unlabeled in-domain data, and then use this data to adapt the parameters of the PLDA system. We explore two versions of agglomerative hierarchical clustering that use the PLDA system. We also study two auto...

Journal

Journal title: IEEE Signal Processing Letters

Year: 2021

ISSN: 1558-2361, 1070-9908

DOI: https://doi.org/10.1109/lsp.2021.3091932